PROBABILITY LEARNING: FIRST-ORDER MARKOV STRUCTURES
OF QUARTERNARY EVENTS
By Edward M. Huff 
Ames Research Center 


SUMMARY 


In many control tasks, man is required to make decisions based on a 
subjective evaluation of the likelihood of future events. It is necessary to 
examine, therefore, the degree to which predictive behavior corresponds with 
the objective probabilities of events, and the degree to which man can learn 
to use information contained in event sequences. 

In the present study, 15 groups of subjects were exposed to different 
sequentially dependent stimulus sequences in a four-alternative probability 
learning paradigm. Groups were found to differ in both learning rate and 
terminal response levels as a joint function of the stochastic structure and 
the degree of dependence in the stimulus sequence. Events with common struc- 
tural properties in different sequences were learned in a similar fashion. 


INTRODUCTION 


The classic probability learning paradigm in which subjects predict suc- 
cessive events may, in principle, be used with any finite number of stimuli 
and any stimulus generator. Most research, however, has been restricted to 
binary situations in which events are sequentially independent (ref. 1). The 
few studies that have been reported with Markovian binary structures
(refs. 2-5) have found that humans learn first-order event dependencies by
responding to short stimulus-response sequences that precede their predictions,
although not all such sequences are learned equally well. The manner in which
subjects learn to predict four-alternative event sequences generated by first- 
order Markov processes (ref. 6) is reported here. 

In one of the few studies using more than two dependent events, Bennett,
Fitts, and Noble (ref. 7) found that a five-alternative Markov process that
generated digrams "concordant" with previously determined response prefer-
ences was learned more quickly than a similar stimulus process that generated
"discordant" digrams. These generators were of a special type in which only
two nonzero (and in this case equal) transition probabilities were allowed in
each row of the Markov matrix. The nonzero entries defined two "appropriate"
choices following each stimulus. Although the authors concluded that response
preferences are important in probability learning, their use of concordant
and discordant digrams was confounded with structural differences between the
generators (ref. 8). Thus, the fact that the concordant generator created
different kinds of event sub-sequences than the discordant generator (e.g.,
repetition and alternation runs) could have accounted for their results. It 
is difficult to evaluate this hypothesis since the response vector for the 
two appropriate alternatives was not reported; however, the discordant genera- 
tor did not suppress the proportion of appropriate choices below chance level 
during the early stages of learning. As the authors themselves note, such 
suppression would be expected if response preference were the controlling 
variable. Their suggestion that between-group differences resulted in the 
discordant sequences being effectively "neutral" is clearly speculative. 


DESIGN AND PROCEDURE 


The quarternary Markov matrices used in the present study were restricted
to those that are doubly stochastic and contain one high transition probabil-
ity α (>0.25) in each row, the remainder of the probabilities being equal to
β [= (1 - α)/3]. In such matrices, each event has a high probability (α) of
being followed by some particular event (perhaps itself) and, provided that
α < 1, each event will asymptotically appear with equal relative frequency.
Predicting the event specified by the α probability will be called a
profitable¹ prediction.
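
The restrictions just described can be checked with a short sketch. It assumes that a matrix is specified by a hypothetical "successor" list, in which entry i names the event that follows event i with the high probability (written alpha in the code), and it uses exact fractions so that the row and column sums can be compared with 1 without rounding error. The function names and the example values are illustrative only and are not taken from the study.

```python
from fractions import Fraction


def markov_matrix(successor, alpha):
    """Build the transition matrix for a successor list.

    successor[i] is the event that follows event i with probability alpha;
    every other entry in row i receives beta = (1 - alpha) / (n - 1).
    """
    n = len(successor)
    beta = (1 - alpha) / (n - 1)
    return [[alpha if j == successor[i] else beta for j in range(n)]
            for i in range(n)]


def is_doubly_stochastic(matrix):
    """True when every row and every column sums to exactly 1."""
    n = len(matrix)
    rows_ok = all(sum(row) == 1 for row in matrix)
    cols_ok = all(sum(matrix[i][j] for i in range(n)) == 1 for j in range(n))
    return rows_ok and cols_ok


# Illustrative example: event 0 repeats; events 1, 2, 3 cycle 1 -> 2 -> 3 -> 1.
alpha = Fraction(7, 10)
P = markov_matrix([0, 2, 3, 1], alpha)

# One step of the chain applied to the uniform distribution over four events.
uniform = [Fraction(1, 4)] * 4
next_dist = [sum(uniform[i] * P[i][j] for i in range(4)) for j in range(4)]
```

The matrix is doubly stochastic only when the successor list is a permutation of the events, so that each column also contains exactly one high entry. In that case the uniform distribution is left unchanged by one step of the chain, which is the sense in which each event appears with equal relative frequency in the long run.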

What will be referred to as the structure of the process is the general
arrangement of the α probabilities within the 4 × 4 matrix. Since response
preferences for individual stimulus digrams were not considered, exactly five
structures were of interest. These correspond with the unordered partitions
of four indistinguishable events:

\[
\begin{aligned}
S_1 &= (E_i)(E_j)(E_k)(E_l) \\
S_2 &= (E_i)(E_j)(E_k, E_l) \\
S_3 &= (E_i)(E_j, E_k, E_l) \\
S_4 &= (E_i, E_j)(E_k, E_l) \\
S_5 &= (E_i, E_j, E_k, E_l)
\end{aligned}
\]

where i ≠ j ≠ k ≠ l and the subscripted S's and E's refer to the struc-
tures and events, respectively. For each structure, each subset of events
tends to generate a most probable sub-sequence in which the elements cycle
among themselves. For example,

\[
S_3 = \begin{bmatrix}
\alpha & \beta & \beta & \beta \\
\beta & \beta & \alpha & \beta \\
\beta & \beta & \beta & \alpha \\
\beta & \alpha & \beta & \beta
\end{bmatrix}
\]

corresponds with any matrix in which one of the events, E_i, has a high proba-
bility of being followed by itself, and each of the remaining three events,
E_j, E_k, and E_l, has a high probability of being followed by a different event
within the subset.

1 Brunswik's terminology (ref. 9) is expanded here.

Since all the events are members of some structural subset, each event
may be identified with a class C_n (n = 1, 2, 3, 4), where n is the number
of elements in the subset and, hence, the number of elements in a most proba-
ble sub-sequence cycle. Structure S_3 may alternatively be regarded, then,
as containing two classes of events, C_1 and C_3, which tend to create the most
probable sub-sequences ... E_i E_i E_i ... and ... E_j E_k E_l E_j E_k ...,
respectively. Inspection will reveal that each of S_1, S_2, and S_3 contains
C_1 events, and each of S_2 and S_4 contains C_2 events. Structures S_3 and
S_5 alone contain events in classes C_3 and C_4, respectively.
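
The assignment of events to classes can be illustrated by decomposing a matrix into its most probable cycles. The sketch below assumes a hypothetical numerical instance of the third structure (high probability 0.7, remaining probabilities 0.1) and recovers the structural subsets by following, from each event, the transition with the largest probability. The helper names and the lookup table are illustrative assumptions, not part of the original procedure.

```python
def successor_map(matrix):
    """For each row, return the column index holding the largest probability."""
    return [max(range(len(row)), key=row.__getitem__) for row in matrix]


def cycles(successor):
    """Split a permutation, given as a successor list, into its cycles."""
    seen, result = set(), []
    for start in range(len(successor)):
        if start in seen:
            continue
        cycle, node = [], start
        while node not in seen:
            seen.add(node)
            cycle.append(node)
            node = successor[node]
        result.append(cycle)
    return result


# Cycle lengths (sorted) identify which of the five structures a matrix has.
STRUCTURES = {
    (1, 1, 1, 1): "S1",
    (1, 1, 2): "S2",
    (1, 3): "S3",
    (2, 2): "S4",
    (4,): "S5",
}

a, b = 0.7, 0.1  # illustrative values of the high and low probabilities
S3 = [[a, b, b, b],
      [b, b, a, b],
      [b, b, b, a],
      [b, a, b, b]]

subsets = cycles(successor_map(S3))
# The class of an event is the length of the cycle that contains it.
event_class = {event: len(c) for c in subsets for event in c}
structure = STRUCTURES[tuple(sorted(len(c) for c in subsets))]
```

For this example the first event forms a one-element subset and the other three events form a single three-element cycle, so the matrix contains one event of the first class and three events of the third class.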

The advantage of using these particular restrictions (ref. 10) is that
for fixed α, the redundancy (ref. 11) of all five structures is equal. Fur-
ther, independent of α, the marginal (zero-order) uncertainty of all struc-
tures is maximal (i.e., two bits). Hence, not only must the subject learn
first-order dependencies for performance to improve, but differences in behav-
ior as a function of the structure variable are isolated from the degree of
redundancy in the stimulus sequence.
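
A brief numerical sketch may make the equal-redundancy property concrete. It assumes the usual information-theoretic definition of redundancy, one minus the ratio of the first-order (conditional) uncertainty to the maximum uncertainty of two bits; the text cites ref. 11 for the definition actually used, so this formula is an assumption. Because every row of every structure contains the same set of probabilities (one high value and three equal low values) and the events are equally frequent, the result depends only on the high probability and not on the structure.

```python
import math


def conditional_uncertainty(alpha, n_events=4):
    """First-order uncertainty, in bits, of a row with one alpha and equal betas."""
    beta = (1 - alpha) / (n_events - 1)
    terms = [alpha] + [beta] * (n_events - 1)
    # Terms with zero probability contribute nothing to the sum.
    return -sum(p * math.log2(p) for p in terms if p > 0)


def redundancy(alpha, n_events=4):
    """Assumed definition: 1 - (conditional uncertainty / maximum uncertainty)."""
    h_max = math.log2(n_events)  # two bits for four equally likely events
    return 1 - conditional_uncertainty(alpha, n_events) / h_max


# The three levels of the high transition probability used in the study.
for alpha in (0.4, 0.55, 0.7):
    print(alpha,
          round(conditional_uncertainty(alpha), 3),
          round(redundancy(alpha), 3))
```

Under this definition the redundancy is zero when the high probability equals 0.25 (no dependence) and one when it equals 1.0 (a fully determined sequence), and it increases across the three levels used here.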

Three α values, 0.4, 0.55, and 0.7, were used in the present study
because they are sufficiently different from 0.25 and 1.0 to allow departures
from either chance or maximizing² behavior to be assessed; they also allow a
reasonable range of stimulus dependency to be examined. At each level of α,
a single sequence of 500 events was generated according to the rules of each
stochastic structure. Sequences were required to have marginal event frequen-
cies and transition frequencies governed by the α probabilities within 4
and 5 percent of their expected values, respectively. Transition frequencies
governed by the β probabilities were allowed to vary from 10 to 20 percent
as an inverse function of α. In all cases, the empirical matrices were
judged to be highly representative of the corresponding theoretical process.³
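
One way such a sequence could be obtained is sketched below. The report does not say how candidate sequences were produced or screened, so the rejection approach (draw a sequence, keep it only if it meets the tolerances) is an assumption, as is the reading of the 4 and 5 percent limits as deviations relative to the expected values. The looser limits on the low-probability transitions are not checked here, and all function and parameter names are hypothetical.

```python
import random
from collections import Counter


def generate_sequence(successor, alpha, length, rng):
    """Draw `length` events from the chain defined by `successor` and alpha."""
    n = len(successor)
    event = rng.randrange(n)
    sequence = [event]
    for _ in range(length - 1):
        if rng.random() < alpha:
            event = successor[event]  # the high-probability transition
        else:
            # One of the remaining, equally likely transitions.
            event = rng.choice([e for e in range(n) if e != successor[event]])
        sequence.append(event)
    return sequence


def max_relative_deviations(sequence, successor, alpha):
    """Largest relative deviation of (marginal, high-transition) frequencies."""
    n = len(successor)
    marginal = Counter(sequence)
    pairs = Counter(zip(sequence, sequence[1:]))
    row_totals = Counter(sequence[:-1])
    expected = 1 / n
    marg_dev = max(abs(marginal[e] / len(sequence) - expected) / expected
                   for e in range(n))
    trans_dev = max(
        abs(pairs[(e, successor[e])] / row_totals[e] - alpha) / alpha
        if row_totals[e] else float("inf")
        for e in range(n)
    )
    return marg_dev, trans_dev


def representative_sequence(successor, alpha, length=500, seed=0,
                            marginal_tol=0.04, transition_tol=0.05,
                            max_tries=100000):
    """Return the first generated sequence that meets both tolerances."""
    rng = random.Random(seed)
    for _ in range(max_tries):
        seq = generate_sequence(successor, alpha, length, rng)
        marg_dev, trans_dev = max_relative_deviations(seq, successor, alpha)
        if marg_dev <= marginal_tol and trans_dev <= transition_tol:
            return seq
    raise RuntimeError("no sequence met the tolerances")
```

A call such as representative_sequence([0, 2, 3, 1], 0.7) would search for a 500-event sequence of the third structure; with the default tolerances many candidates are typically rejected before one is accepted.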

Each of the 15 event sequences (five structures for each of the three α
values) was recorded on paper tape and used to control the temporal sequence
of four visual stimuli (+, -, o, x) presented to a different group of subjects.
Ten college students between the ages of 17 and 28 were randomly assigned to
each group and were paid at a fixed hourly rate. In order to distribute the
effects of possible symbol digram preferences, the four events recorded on
the control tape were randomly identified with a different symbol ordering for
each subject.

2 Such behavior would be obtained if subjects optimized the number of
profitable predictions by invariably choosing alternatives specified by the
α probabilities.

3 As β diminishes, the expected marginal frequency for each unlikely
alternative approaches zero. The proportional change due to even a single
frequency inversion, therefore, becomes quite severe and requires special
treatment. The Anderson-Goodman contingency test (ref. 12), nevertheless, did
not reveal significant departures from the theoretical matrices at the 0.01
level of confidence.

Subjects were tested two at a time in separate booths and instructed to
predict which symbol would appear next on a small rear projection console
screen. They made their predictions by pressing one of four buttons, each
identified with one of the symbols, during a 1.5-second interstimulus interval
identified by the onset of a yellow panel light. Each symbol appeared for
4 seconds. The selected response button lit up following a prediction and
remained on until the next symbol appeared.⁴


RESULTS 


A separate learning curve was first computed for each event, E, in each
structure by determining the average percentage of profitable predictions made
by the 10 subjects during each block of 12 successive E occurrences.⁵ Since
all events occurred at least 120 times in a given sequence, this procedure
guaranteed a minimum of 10 blocks of equal statistical precision for each
curve.
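
The blocking procedure can be sketched as follows. The data layout is an assumption made for illustration: the stimulus sequence is a list of event indices, each subject's responses are a list aligned so that the response at position t is the prediction made after the stimulus at position t, and a hypothetical successor list names the profitable prediction for each event. Occurrences left over after the last complete block are dropped, which is also an assumption.

```python
def learning_curve(stimuli, responses_by_subject, successor, event,
                   block_size=12):
    """Percentage of profitable predictions per block of occurrences of `event`.

    stimuli[t] is the event shown on trial t; responses_by_subject[s][t] is
    subject s's prediction of stimuli[t + 1], made after seeing stimuli[t].
    The percentage in each block is pooled over all subjects.
    """
    # Trials on which `event` was the stimulus preceding a prediction.
    trials = [t for t in range(len(stimuli) - 1) if stimuli[t] == event]
    target = successor[event]  # the profitable prediction after `event`
    curve = []
    for start in range(0, len(trials) - block_size + 1, block_size):
        block = trials[start:start + block_size]
        hits = sum(resp[t] == target
                   for resp in responses_by_subject for t in block)
        curve.append(100.0 * hits / (len(block) * len(responses_by_subject)))
    return curve
```

Because every block contains the same number of occurrences of the event, each point on the resulting curve rests on the same number of predictions, which is the sense in which the blocks have equal statistical precision.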

Figure 1 presents learning curves for stimulus classes C_n (n = 1, 2, 3,
4) averaged by block over structures and subjects. Figure 2 separates the
learning curves by structure for each of the two event classes which are com-
bined over structures in figure 1, that is, C_1 and C_2. The statistical pre-
cision characteristically varies from curve to curve as a result of the
unequal class membership distribution between the structures. In all cases
where two or more events within a structure belonged to the same class, how-
ever, no meaningful differences between the individual learning curves were
noted.

The present data support the hypothesis presented earlier concerning the
importance of the structural variable in probability learning. They show
that quarternary structures, and in all likelihood n-ary structures in gen-
eral, involve multiple learning rates and asymptotic levels (see fig. 1) which
are a joint function of the structural properties and the redundancy of the
controlling stimulus generator. Furthermore, although some degree of within-
structure interaction is reflected in the learning rates (see fig. 2), for a
given α the terminal response level for an event is mainly determined by the
sub-sequence class of which it is a member, rather than the particular stimu-
lus structure in which it is imbedded. Indeed, these asymptotic levels are,
as a rule, different from α and support Hake and Hyman's contention (ref. 2,
pp. 72-73) that probability matching, when observed, is the fortuitous result
of a complex learning process. No tendency for matching was found in the
present data. There is some indication, however, that with sufficient stimu-
lus dependency, the terminal proportions of profitable predictions would
converge to a common level for all event classes (see fig. 1(c)).

4 The presentation interval was made quite long relative to the response
interval in order to induce the subject to prepare his response while examin-
ing the prior stimulus. The subjects, therefore, may have found it easier
than subjects in previous studies (ref. 13) to learn first-order associations.

5 The common analytic technique with binary sequences is to examine blocks
of arbitrary event occurrences. This procedure is most convenient when the
digram frequencies have also been forced into theoretical agreement for each
block, and it is identical with the present procedure when marginal frequen-
cies are equal. Here (in order to avoid unnecessary higher-order dependen-
cies), digram frequencies were not restricted within blocks.


DISCUSSION 


Certain interrelated factors bear on the present findings. These are 
mentioned briefly in order to point up similarities with other classical
learning situations. First, as the number of stimuli in an event class diminishes,
the occurrences of each component stimulus become clustered within the 
sequence. The results, therefore, may reflect merely a relative advantage of 
massed versus distributed practice in a complex task (ref. 14). Second, as 
the number of stimuli involved in a structurally preferred sub-sequence is 
reduced, the potential of each component stimulus to develop conflicting 
remote associations is diminished, provided that associative strength is 
inversely related to distance within the sequence (ref. 15). Both factors are 
complicated, however, by the fact that most probable sub-sequences involving 
the greatest number of events also have the greatest number of starting 
(entry) points, as well as the greatest potential for atypical reentry into 
their own event set. Hence, the present results would be expected if one 
posits, as Hake and Hyman (ref. 2) suggested, that specific stimulus n-tuples,
rather than the probabilistic structure of the stimulus generator, are
learned. In this case a different n-tuple would be associated with each 
starting point in the sub-sequence. It is also true, however, that behavioral 
adjustment to unexpected contingencies is often difficult because of mental or 
motor set (refs. 16, 17, 18). Even if subjects did perceive the probabilistic 
properties of the generator, then, one would expect a differential performance 
decrement between event classes as a function of the number of atypical 
sub-sequence reentries. 

These arguments may be somewhat difficult to untangle, but it is clear
that many of the same variables are operating here as in the more determin-
istic serial and paired-associates learning paradigms. It may be noted, more-
over, that in the extreme (α = 1.0), each structural subset represents a
serial learning task. The present findings would appear to be, therefore, the
probabilistic analog of the well documented inverse relationship between
serial list length and learning speed (ref. 14).


Ames Research Center 

National Aeronautics and Space Administration 

Moffett Field, Calif., 94035, April 8, 1969 
127-51-09-01-00-21 

(c) α = 0.7
Figure 1.- Concluded.

Figure 2.- Learning curves for event classes C_1 and C_2 averaged separately
by structures for each α level.

(d) C_2 events, α = 0
Figure 2.- Continued.

(f) C_2 events, α = 0.7
Figure 2.- Concluded.